An Information Retrieval System for Improving Efficiency in Scientific Literature Searches

Author

  • Daniel M. Dunlavy
Abstract

Conducting scientific research usually involves searching the existing literature in order to avoid repeating earlier work, review methods already developed for solving a problem, or gain a better understanding of the problem. Typically, this search is performed over the Internet, which is a convenient portal to various databases of books, journal articles, technical reports, and preprints. This approach to information retrieval (IR) gives a researcher access to a large amount of reference material. However, with this access comes the challenge of efficiently retrieving and processing only the relevant material. When an information retrieval engine is used to search electronic resources, simple queries can return too many documents, or documents that are not relevant to the intended search criteria. For example, if a physicist is researching which methods have been used to solve problems in plasma physics, a search for “methods plasma physics” on the World Wide Web using Google (www.google.com) yields more than 27,000 documents (out of approximately 2.5 billion). The physicist could never hope to read all of those documents. Even searching arXiv (xxx.lanl.gov), a preprint server for physics articles, for the same information yields more than 350 documents (out of approximately 135,000). This is still too much information to process. Just reading the abstracts of those 350 articles would require a great deal of time, and there is no guarantee that this document set would be representative enough to constitute a thorough literature search. With these issues in mind, the goal of this project is to develop an information retrieval system that

Similar resources

An Information Retrieval System for Improving Efficiency in Scientific Literature Searches: Progress Report #1

This report details the work completed in the first three months of development of the QCS (Query, Cluster, Summarize) information retrieval (IR) system. In the first section, the project as it was originally proposed is briefly presented. (For a more detailed account of the proposal, the interested reader is directed to the original proposal [Dun].) Sections 3, 4, and 5 give the details of the...

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes that generate a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...

Iranian Gateway for Scientific, Research, and Technological Information: A New Service for Iranian Researchers

Information subject gateways provide access to quality-controlled databases among the vast resources of the web, sparing users the confusion of sorting through the many sources available. The main objective of this research is to create the Iranian Gateway for Scientific, Research, and Technological Information as a valuable source for use by academics and researche...

Investigating the Effects of Stemming on Information Retrieval in the Persian Language

Exploiting language-specific behavior in information retrieval systems can significantly improve the quality of the retrieved results. The part of a word that remains after its affixes are removed is called the stem. Stemming can be used to improve the relevance of the results in an information retrieval system. Different morphological variants of words (plural, past tense…) will be mapped into t...

Publication date: 2002